This dataset comes from Kaggle (https://www.kaggle.com/roshansharma/sanfranciso-crime-dataset), and I will use it to develop a data product as my final project. I subset the data for three different graphs: crime type (df2), crime in each district (df3), and most dangerous streets (df4). To keep the graphs readable, only the most frequent cases are plotted: crime types with more than 100 incidents (n > 100) and streets with more than 200 incidents (n > 200).
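library(dplyr)  # provides distinct() and count()

# Assemble the columns of interest and drop rows with missing values
PoliceData <- data.frame(latitude, longitud, category, incidntNum, districts, dayoftheweek, resolution, street)
FinalData <- PoliceData[complete.cases(PoliceData), ]

# Keep one row per incident number
df <- distinct(FinalData, incidntNum, .keep_all = TRUE)

# Crime types: keep categories with more than 100 incidents
df2 <- count(df, category, sort = TRUE)
df2 <- subset(df2, n > 100)

# Districts: drop rows with an empty district name before counting
dfdistrict <- subset(df, districts != '')
df3 <- count(dfdistrict, districts, sort = TRUE)

# Streets: keep streets with more than 200 incidents
df4 <- count(df, street, sort = FALSE)
dfsub <- subset(df4, n > 200)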
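To make the deduplicate, count, and filter steps above easier to follow, here is a minimal sketch on a tiny made-up data frame. The `toy` data and the threshold of 2 are assumptions chosen only for illustration; they are not part of the Kaggle dataset, and the threshold stands in for the 100 or 200 used on the real data.

```r
library(dplyr)

# Hypothetical toy data: incident 2 appears twice, as can happen when
# one incident is recorded on several rows.
toy <- data.frame(
  incidntNum = c(1, 2, 2, 3, 4, 5),
  category   = c("THEFT", "ASSAULT", "ASSAULT", "THEFT", "THEFT", "FRAUD"),
  stringsAsFactors = FALSE
)

# Keep one row per incident number, retaining the other columns.
toy_unique <- distinct(toy, incidntNum, .keep_all = TRUE)

# Count incidents per category, most frequent first.
toy_counts <- count(toy_unique, category, sort = TRUE)

# Keep only categories above the (illustrative) threshold.
toy_top <- subset(toy_counts, n > 2)
print(toy_top)
```

Deduplicating before counting matters here: without `distinct()`, the repeated incident would inflate the ASSAULT count.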